🐿️ Scour
🔧 Systems-level optimizations for LLM serving
R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling
arxiv.org·3d
✨Model optimizations in LLMs
Inside NVIDIA Blackwell Ultra: The Chip Powering the AI Factory Era
developer.nvidia.com·4d·Discuss: r/hardware
📊AI Performance Profiling
Subjective Behaviors and Preferences in LLM: Language of Browsing
arxiv.org·3d
🧠Large Language Models (LLMs)
Time-Optimal Directed q-Analysis
arxiv.org·3d
✨Model optimizations in LLMs
MoEcho: Exploiting Side-Channel Attacks to Compromise User Privacy in Mixture-of-Experts LLMs
arxiv.org·3d
🧠Large Language Models (LLMs)
Benchmarking LLM-based Agents for Single-cell Omics Analysis
arxiv.org·5d
🤖Agents using LLMs
Hydra: A 1.6B-Parameter State-Space Language Model with Sparse Attention, Mixture-of-Experts, and Memory
arxiv.org·3d
🧠Large Language Models (LLMs)
EMNLP: Educator-role Moral and Normative Large Language Models Profiling
arxiv.org·3d
🧠Large Language Models (LLMs)
Multiple Memory Systems for Enhancing the Long-term Memory of Agent
arxiv.org·3d
🤖Agents using LLMs
From 5G RAN Queue Dynamics to Playback: A Performance Analysis for QUIC Video Streaming
arxiv.org·3d
💬Prompt optimizations for LLM serving
Comp-X: On Defining an Interactive Learned Image Compression Paradigm With Expert-driven LLM Agent
arxiv.org·3d
🔢Quantization of LLMs
Active Learning for Neurosymbolic Program Synthesis
arxiv.org·3d
🧠Large Language Models (LLMs)
Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services
arxiv.org·4d
⚙️AI Infrastructure Automation
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization
arxiv.org·4d·Discuss: Hacker News
✨Model optimizations in LLMs
Online Incident Response Planning under Model Misspecification through Bayesian Learning and Belief Quantization
arxiv.org·4d
✨Model optimizations in LLMs
Unplug and Play Language Models: Decomposing Experts in Language Models at Inference Time
arxiv.org·3d
🧠Large Language Models (LLMs)
Content Accuracy and Quality Aware Resource Allocation Based on LP-Guided DRL for ISAC-Driven AIGC Networks
arxiv.org·6d
📊AI Performance Profiling
Integrated Take-off Management and Trajectory Optimization for Merging Control in Urban Air Mobility Corridors
arxiv.org·3d
✨Model optimizations in LLMs
Benchmarking Computer Science Survey Generation
arxiv.org·3d
📊AI Performance Profiling
Non-linear Welfare-Aware Strategic Learning
arxiv.org·3d
✨Model optimizations in LLMs